Document compression using rate-distortion optimized segmentation

Authors

Hui Cheng

Charles A. Bouman

Abstract

Effective document compression algorithms require that scanned document images be first segmented into regions such as text, pictures, and background. In this paper, we present a multilayer compression algorithm for document images. This compression algorithm first segments a scanned document image into different classes, then compresses each class using an algorithm specifically designed for that class. Two algorithms are investigated for segmenting document images: a direct image segmentation algorithm called the trainable sequential MAP (TSMAP) segmentation algorithm, and a rate-distortion optimized segmentation (RDOS) algorithm. The RDOS algorithm works in a closed-loop fashion by applying each coding method to each region of the document and then selecting the method that yields the best rate-distortion trade-off. Compared with the TSMAP algorithm, the RDOS algorithm can often achieve a better rate-distortion trade-off and produce more robust segmentations by eliminating misclassifications that can cause severe artifacts. At similar bit rates, the multilayer compression algorithm using RDOS can achieve a much higher subjective quality than state-of-the-art compression algorithms such as DjVu and SPIHT. © 2001 SPIE and IS&T. [DOI: 10.1117/1.1344590]
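The closed-loop selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Lagrangian cost D + lambda * R, the three toy coders, the class names, and the function `rdos_segment` are all assumptions introduced here for illustration, and the abstract does not specify the actual cost function or coders.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Block = Sequence[int]  # flattened grayscale block, pixel values 0..255
# Hypothetical coder interface: returns (rate in bits, mean squared error).
Coder = Callable[[Block], Tuple[float, float]]


def flat_coder(block: Block) -> Tuple[float, float]:
    """Toy 'background' coder: keep only the block mean (8 bits)."""
    mean = round(sum(block) / len(block))
    mse = sum((p - mean) ** 2 for p in block) / len(block)
    return 8.0, mse


def bilevel_coder(block: Block) -> Tuple[float, float]:
    """Toy 'text' coder: two 8-bit gray levels plus a 1-bit-per-pixel mask."""
    threshold = sum(block) / len(block)
    low = [p for p in block if p <= threshold]   # never empty: min <= mean
    high = [p for p in block if p > threshold]
    lo = round(sum(low) / len(low))
    hi = round(sum(high) / len(high)) if high else lo
    mse = sum((p - (hi if p > threshold else lo)) ** 2 for p in block) / len(block)
    return 16.0 + len(block), mse


def raw_coder(block: Block) -> Tuple[float, float]:
    """Toy 'picture' coder: 8 bits per pixel, lossless."""
    return 8.0 * len(block), 0.0


# Assumed class names; each class is tied to one coder.
CODERS: Dict[str, Coder] = {
    "background": flat_coder,
    "text": bilevel_coder,
    "picture": raw_coder,
}


def rdos_segment(blocks: Sequence[Block], coders: Dict[str, Coder], lam: float) -> List[str]:
    """Label each block with the class whose coder minimizes D + lam * R."""
    labels = []
    for block in blocks:
        costs = {}
        for name, coder in coders.items():
            rate, distortion = coder(block)  # closed loop: actually run the coder
            costs[name] = distortion + lam * rate
        labels.append(min(costs, key=costs.get))
    return labels


blocks = [
    [200] * 16,                  # uniform region
    [0] * 8 + [255] * 8,         # two-level region
    list(range(0, 256, 16)),     # smooth ramp
]
print(rdos_segment(blocks, CODERS, lam=1.0))
# ['background', 'text', 'picture']
```

In this sketch the weight `lam` sets the trade-off: a larger value penalizes rate more heavily, so more blocks fall back to the cheapest coder.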

Similar resources

Rate-distortion-based segmentation for MRC compression

Effective document compression algorithms require scanned document images be first segmented into regions such as text, pictures and background. In this paper, we present a document compression algorithm that is based on the 3-layer (foreground/mask/background) MRC (mixture raster content) model. This compression algorithm first segments a scanned document image into different classes. Then, ea...

Multilayer Document Compression Algorithm

In this paper, we propose a multilayer document compression algorithm. This algorithm first segments a scanned document image into different classes such as text, images and background, then compresses each class using an algorithm specifically designed for that class. Two algorithms are investigated for segmenting documents: a general purpose image segmentation algorithm called the trainable s...

Document Image Segmentation and Compression

Cheng, Hui, Ph.D., Purdue University, August, 1999. Document Image Segmentation and Compression. Major Professor: Charles A. Bouman. In the first part of this research, we propose an image segmentation algorithm called the trainable sequential MAP (TSMAP) algorithm. The TSMAP algorithm is based on a multiscale Bayesian approach. It has a novel multiscale context model which can capture complex ...

Compression of Compound Documents

Compound (or mixed) document images contain graphic or textual content along with pictures. They are a very common form of documents, found in magazines, brochures, web-sites etc. Because of the very distinct nature of those two image classes (text/graphics vs. pictures), their compression invariably involves multiple compression systems and a region segmentation (classification) method. We rev...

Optimizing block-thresholding segmentation for multilayer compression of compound images

Compound document images contain graphic or textual content along with pictures. They are a very common form of documents, found in magazines, brochures, Web sites, etc. We focus our attention on the mixed raster content (MRC) multilayer approach for compound image compression. We study block thresholding as a means to segment an image for MRC. An attempt is made to optimize the block threshold...

Journal:

J. Electronic Imaging

Volume 10

Publication date: 2001
